Dynamic Memory Networks for Visual and Textual Question Answering
نویسندگان
چکیده
Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering. One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks. However, it was not shown whether the architecture achieves strong results for question answering when supporting facts are not marked during training or whether it could be applied to other modalities such as images. Based on an analysis of the DMN, we propose several improvements to its memory and input modules. Together with these changes we introduce a novel input module for images in order to be able to answer visual questions. Our new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting fact supervision.
منابع مشابه
Memory Networks
We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively ...
متن کاملLearning Convolutional Text Representations for Visual Question Answering
Visual question answering is a recently proposed articial intelligence task that requires a deep understanding of both images and texts. In deep learning, images are typically modeled through convolutional neural networks, and texts are typically modeled through recurrent neural networks. While the requirement for modeling images is similar to traditional computer vision tasks, such as object ...
متن کاملEmory N Etworks
We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively ...
متن کاملIncorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks
Visual Question Answering (VQA) has attracted much attention since it offers insight into the relationships between the multi-modal analysis of images and natural language. Most of the current algorithms are incapable of answering open-domain questions that require to perform reasoning beyond the image contents. To address this issue, we propose a novel framework which endows the model capabili...
متن کاملDynamic Inference: Using Dynamic Memory Networks for Question Answering
Question Answering is an incredibly important task in Natural Language Processing (NLP), and we aim to experiment with models in Machine Comprehension to attempt to perform well on Question Answering tasks. We specifically work with the Faceboook bAbi dataset, and aim to achieve strong results using Dynamic Memory Networks, which are known to have strong performance for this task. Dynamic Memor...
متن کامل